Enable reporting peak memory usage for gtests #18599

Merged
3 commits merged into rapidsai:branch-25.06 from gtest-memory-peak on Apr 30, 2025

Conversation

davidwendt
Contributor

Description

Enables libcudf gtests to report peak memory usage after the tests complete.
The peak is measured with rmm::mr::statistics_resource_adaptor, and reporting is enabled by setting the environment variable GTEST_CUDF_MEMORY_PEAK.
Working on this uncovered that most of the STREAM_-based tests were not using CUDF_TEST_PROGRAM_MAIN() and so did not support custom parameters. At least one test also failed the stream check once this was corrected.
The PR includes a shell script that runs each test and reports its peak memory to stdout in CSV format.
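
As a rough illustration of what "peak memory" means here, the sketch below keeps a running byte count and a high-water mark, which is the general idea behind a statistics-style adaptor. It is a conceptual Python model only: the class and method names (`PeakTracker`, `allocate`, `deallocate`) are hypothetical and are not the rmm API.

```python
class PeakTracker:
    """Conceptual model of peak-memory accounting (hypothetical, not the rmm API)."""

    def __init__(self):
        self.current_bytes = 0  # bytes currently outstanding
        self.peak_bytes = 0     # highest value current_bytes has reached

    def allocate(self, nbytes):
        # Every allocation raises the running total and may set a new peak.
        self.current_bytes += nbytes
        self.peak_bytes = max(self.peak_bytes, self.current_bytes)

    def deallocate(self, nbytes):
        # Freeing lowers the running total but never lowers the peak.
        self.current_bytes -= nbytes


tracker = PeakTracker()
tracker.allocate(1024)
tracker.allocate(4096)
tracker.deallocate(4096)
tracker.allocate(512)
print(tracker.peak_bytes)  # 5120
```

The peak is reached after the second allocation (1024 + 4096 bytes) and is retained even though later usage is lower, which is why a single number per test executable can be reported once the tests finish.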

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@davidwendt added the 3 - Ready for Review, libcudf, improvement, and non-breaking labels on Apr 29, 2025
@davidwendt self-assigned this on Apr 29, 2025
@davidwendt requested review from a team as code owners on April 29, 2025 22:57
@davidwendt requested review from vyasr and shrshi on April 29, 2025 22:57
@github-actions bot added the CMake label on Apr 29, 2025
Contributor

@bdice left a comment

Really nice. Can you copy the output of the script here for posterity?

@@ -122,5 +123,8 @@ TEST_F(BinaryopPTXTest, ColumnColumnPTX)
   cudf::binary_operation(
     lhs, rhs, ptx, cudf::data_type(cudf::type_to_id<int32_t>()), cudf::test::get_default_stream());
-  cudf::binary_operation(lhs, rhs, ptx, cudf::data_type(cudf::type_to_id<int64_t>()));
+  cudf::binary_operation(
+    lhs, rhs, ptx, cudf::data_type(cudf::type_to_id<int64_t>()), cudf::test::get_default_stream());
Contributor

Thank you for this fix! Not sure how the stream tests were passing earlier though 😕

@davidwendt
Contributor Author

davidwendt commented Apr 30, 2025

Here is the output as of this PR (test name, peak memory in bytes):

AST_TEST,67440
BINARYOP_TEST,5399987824
BITMASK_TEST,43280
CLAMP_TEST,306512
COLUMN_TEST,46096
COMPRESSION_TEST,511662928
COPYING_TEST,6442451024
CSV_TEST,11456
DATA_CHUNK_SOURCE_TEST,503673776
DATETIME_OPS_TEST,2512
DEVICE_ATOMICS_TEST,131152
DICTIONARY_TEST,3024
DISPATCHER_TEST,16
ENCODE_TEST,2240
FACTORIES_TEST,23856
FILLING_TEST,38864
FIXED_POINT_TEST,16000
FST_TEST,1641920
GROUPBY_TEST,51109968
HASHING_TEST,9664
INTEROP_TEST,3232064
IS_SORTED_TEST,3936
ITERATOR_TEST,17165776
JIT_PARSER_TEST,0
JOIN_TEST,1028178848
JSON_PATH_TEST,5376
JSON_TEST,373295616
JSON_TYPE_CAST_TEST,184736
JSON_WRITER_TEST,27712
LABEL_BINS_TEST,35488
LARGE_STRINGS_TEST,14040002160
LISTS_TEST,11216
LOGICAL_STACK_TEST,1002752
MERGE_TEST,889280
MULTIBYTE_SPLIT_TEST,864421472
NESTED_JSON_TEST,10498688
NORMALIZE_REPLACE_TEST,288
ORC_TEST,6701737280
PARQUET_TEST,2073698320
PARTITIONING_TEST,278288
QUANTILES_TEST,75563856
REDUCTIONS_TEST,12000000
REPLACE_NANS_TEST,2112
REPLACE_NULLS_TEST,452720
REPLACE_TEST,37167584
RESHAPE_TEST,6960
ROLLING_TEST,1870992
ROUND_TEST,75440
ROW_SELECTION_TEST,0
SCALAR_TEST,2768
SEARCH_TEST,17600
SORT_TEST,442320
SPAN_TEST,1024
STREAM_BINARYOP_TEST,128
STREAM_COLUMN_VIEW_TEST,
STREAM_COMPACTION_TEST,9981312
STREAM_CONCATENATE_TEST,432
STREAM_COPYING_TEST,355056
STREAM_DATETIME_TEST,112
STREAM_DICTIONARY_TEST,1920
STREAM_FILLING_TEST,864
STREAM_GROUPBY_TEST,2448
STREAM_HASHING_TEST,640
STREAM_IDENTIFICATION_TEST,
STREAM_IO_CSV_TEST,10192
STREAM_IO_JSON_TEST,5968
STREAM_IO_MULTIBYTE_SPLIT_TEST,604374592
STREAM_IO_ORC_TEST,33968
STREAM_IO_PARQUET_TEST,101744
STREAM_JOIN_TEST,3840
STREAM_LABELING_BINS_TEST,496
STREAM_LISTS_TEST,2240
STREAM_MERGE_TEST,889280
STREAM_NULL_MASK_TEST,272
STREAM_PARTITIONING_TEST,1840
STREAM_POOL_TEST,0
STREAM_QUANTILE_TEST,1664
STREAM_REDUCTION_TEST,992
STREAM_REPLACE_TEST,1920
STREAM_RESHAPE_TEST,1136
STREAM_ROLLING_TEST,1904
STREAM_ROUND_TEST,96
STREAM_SCALAR_TEST,32
STREAM_SEARCH_TEST,432
STREAM_SORTING_TEST,2496
STREAM_STREAM_COMPACTION_TEST,2416
STREAM_STRINGS_TEST,3720096
STREAM_TEXT_TEST,5414928
STREAM_TRANSFORM_TEST,4256
STREAM_TRANSPOSE_TEST,10256
STREAM_UNARY_TEST,80
STRINGS_TEST,4401456
STRUCTS_TEST,11296
TABLE_TEST,2480
TEXT_TEST,10823232
TIMESTAMPS_TEST,2032
TRAITS_TEST,0
TRANSFORM_TEST,152720
TRANSPOSE_TEST,962272
TYPE_INFERENCE_TEST,272
UNARY_TEST,245440
UTILITIES_TEST,8326080

Perhaps there is a data-frame-ish library we could use to figure out how to group these appropriately for parallel processing by ctest.
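
One way to explore such a grouping without a data-frame library is a small first-fit-decreasing pass over the CSV, sketched below with only the Python standard library. The function names (`parse_peaks`, `group_by_budget`) and the 8 GiB per-group budget are assumptions made for illustration; this is not how ctest is configured in the PR.

```python
import csv
import io


def parse_peaks(csv_text):
    """Parse 'TEST_NAME,peak_bytes' rows; rows with no value are skipped."""
    peaks = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) == 2 and row[1].strip():
            peaks[row[0].strip()] = int(row[1])
    return peaks


def group_by_budget(peaks, budget_bytes):
    """First-fit decreasing: place each test in the first group with room."""
    groups = []  # each entry: [total_bytes, [test names]]
    for name, size in sorted(peaks.items(), key=lambda kv: kv[1], reverse=True):
        for group in groups:
            if group[0] + size <= budget_bytes:
                group[0] += size
                group[1].append(name)
                break
        else:
            # No existing group has room (or the test alone exceeds the budget).
            groups.append([size, [name]])
    return [names for _, names in groups]


# A few rows copied from the output above, including one with no value.
sample = """ORC_TEST,6701737280
COPYING_TEST,6442451024
BINARYOP_TEST,5399987824
PARQUET_TEST,2073698320
JOIN_TEST,1028178848
STREAM_COLUMN_VIEW_TEST,
"""

groups = group_by_budget(parse_peaks(sample), budget_bytes=8 * 2**30)
for names in groups:
    print(names)
```

With this sample and budget, the pass pairs ORC_TEST with JOIN_TEST and COPYING_TEST with PARQUET_TEST, and leaves BINARYOP_TEST on its own. First-fit decreasing is a heuristic, so the grouping is reasonable but not guaranteed to be optimal.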

@bdice
Contributor

bdice commented Apr 30, 2025

I would solve this a little differently: with something simpler than the "optimal" strategy and easier to enforce.

Currently we run our tests in CI with -j20, but we default all tests to requiring 15% of the GPU (so at most 6 tests can run in parallel). Our lowest-VRAM GPU in CI is an L4 with 24 GB. We should be able to safely run any tests that require <1 GB of memory in parallel (even 20 such tests at once would need 20 GB at most). We only have 4 tests using more than that (shown below).

I filed PR #18603 with a proposal to run ORC_TEST, COPYING_TEST, BINARYOP_TEST, and PARQUET_TEST in isolation (100% of the GPU), while allowing all other tests to be run 6-way parallel (all other tests require 15% of the GPU). The values below are the reported byte counts divided by 2^30, sorted in descending order.

ORC_TEST,6.24 GB
COPYING_TEST,6.00 GB
BINARYOP_TEST,5.03 GB
PARQUET_TEST,1.93 GB
JOIN_TEST,0.96 GB
MULTIBYTE_SPLIT_TEST,0.81 GB
STREAM_IO_MULTIBYTE_SPLIT_TEST,0.56 GB
COMPRESSION_TEST,0.48 GB
DATA_CHUNK_SOURCE_TEST,0.47 GB
JSON_TEST,0.35 GB
QUANTILES_TEST,0.07 GB
GROUPBY_TEST,0.05 GB
REPLACE_TEST,0.03 GB
ITERATOR_TEST,0.02 GB
REDUCTIONS_TEST,0.01 GB
TEXT_TEST,0.01 GB
NESTED_JSON_TEST,0.01 GB
STREAM_COMPACTION_TEST,0.01 GB
UTILITIES_TEST,0.01 GB
STREAM_TEXT_TEST,0.01 GB
STRINGS_TEST,0.00 GB
STREAM_STRINGS_TEST,0.00 GB
INTEROP_TEST,0.00 GB
ROLLING_TEST,0.00 GB
FST_TEST,0.00 GB
LOGICAL_STACK_TEST,0.00 GB
TRANSPOSE_TEST,0.00 GB
MERGE_TEST,0.00 GB
STREAM_MERGE_TEST,0.00 GB
REPLACE_NULLS_TEST,0.00 GB
SORT_TEST,0.00 GB
STREAM_COPYING_TEST,0.00 GB
CLAMP_TEST,0.00 GB
PARTITIONING_TEST,0.00 GB
UNARY_TEST,0.00 GB
JSON_TYPE_CAST_TEST,0.00 GB
TRANSFORM_TEST,0.00 GB
DEVICE_ATOMICS_TEST,0.00 GB
STREAM_IO_PARQUET_TEST,0.00 GB
ROUND_TEST,0.00 GB
AST_TEST,0.00 GB
COLUMN_TEST,0.00 GB
BITMASK_TEST,0.00 GB
FILLING_TEST,0.00 GB
LABEL_BINS_TEST,0.00 GB
STREAM_IO_ORC_TEST,0.00 GB
JSON_WRITER_TEST,0.00 GB
FACTORIES_TEST,0.00 GB
SEARCH_TEST,0.00 GB
FIXED_POINT_TEST,0.00 GB
CSV_TEST,0.00 GB
STRUCTS_TEST,0.00 GB
LISTS_TEST,0.00 GB
STREAM_TRANSPOSE_TEST,0.00 GB
STREAM_IO_CSV_TEST,0.00 GB
HASHING_TEST,0.00 GB
RESHAPE_TEST,0.00 GB
STREAM_IO_JSON_TEST,0.00 GB
JSON_PATH_TEST,0.00 GB
STREAM_TRANSFORM_TEST,0.00 GB
IS_SORTED_TEST,0.00 GB
STREAM_JOIN_TEST,0.00 GB
DICTIONARY_TEST,0.00 GB
SCALAR_TEST,0.00 GB
DATETIME_OPS_TEST,0.00 GB
STREAM_SORTING_TEST,0.00 GB
TABLE_TEST,0.00 GB
STREAM_GROUPBY_TEST,0.00 GB
STREAM_STREAM_COMPACTION_TEST,0.00 GB
ENCODE_TEST,0.00 GB
STREAM_LISTS_TEST,0.00 GB
REPLACE_NANS_TEST,0.00 GB
TIMESTAMPS_TEST,0.00 GB
STREAM_DICTIONARY_TEST,0.00 GB
STREAM_REPLACE_TEST,0.00 GB
STREAM_ROLLING_TEST,0.00 GB
STREAM_PARTITIONING_TEST,0.00 GB
STREAM_QUANTILE_TEST,0.00 GB
STREAM_RESHAPE_TEST,0.00 GB
SPAN_TEST,0.00 GB
STREAM_REDUCTION_TEST,0.00 GB
STREAM_FILLING_TEST,0.00 GB
STREAM_HASHING_TEST,0.00 GB
STREAM_LABELING_BINS_TEST,0.00 GB
STREAM_CONCATENATE_TEST,0.00 GB
STREAM_SEARCH_TEST,0.00 GB
NORMALIZE_REPLACE_TEST,0.00 GB
STREAM_NULL_MASK_TEST,0.00 GB
TYPE_INFERENCE_TEST,0.00 GB
STREAM_BINARYOP_TEST,0.00 GB
STREAM_DATETIME_TEST,0.00 GB
STREAM_ROUND_TEST,0.00 GB
STREAM_UNARY_TEST,0.00 GB
STREAM_SCALAR_TEST,0.00 GB
DISPATCHER_TEST,0.00 GB
JIT_PARSER_TEST,0.00 GB
LARGE_STRINGS_TEST,0.00 GB
ROW_SELECTION_TEST,0.00 GB
STREAM_COLUMN_VIEW_TEST,0.00 GB
STREAM_IDENTIFICATION_TEST,0.00 GB
STREAM_POOL_TEST,0.00 GB
TRAITS_TEST,0.00 GB
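
The arithmetic behind this proposal can be checked with a short sketch. The helper names (`max_parallel`, `needs_isolation`) are hypothetical, the threshold of 2^30 bytes is an assumption matching the "<1 GB" rule of thumb above, and the byte counts are copied from the raw CSV output earlier in the thread.

```python
GIB = 2**30  # the table above divides raw byte counts by this value


def max_parallel(gpu_percent_per_test):
    """How many tests fit when each one reserves this percentage of the GPU."""
    return 100 // gpu_percent_per_test


def needs_isolation(peak_bytes, threshold_bytes=GIB):
    """True when a test's peak exceeds the threshold."""
    return peak_bytes > threshold_bytes


peaks = {
    "ORC_TEST": 6701737280,
    "COPYING_TEST": 6442451024,
    "BINARYOP_TEST": 5399987824,
    "PARQUET_TEST": 2073698320,
    "JOIN_TEST": 1028178848,
    "LARGE_STRINGS_TEST": 14040002160,
}

print(max_parallel(15))  # 6
for name, nbytes in sorted(peaks.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name},{nbytes / GIB:.2f} GB,isolate={needs_isolation(nbytes)}")
```

Run against the raw byte counts, this also flags LARGE_STRINGS_TEST (14040002160 bytes, about 13.08 in these units) as exceeding the threshold, so it is worth treating its 0.00 GB entry in the table with caution.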

@davidwendt
Contributor Author

Updated

LARGE_STRINGS_TEST,14040002160

This is 14 GB, not the 0.00 GB listed in the table above.

Contributor

@bdice left a comment

Re-approving C++ tests.

@davidwendt
Contributor Author

/merge

@rapids-bot bot merged commit ea0ff25 into rapidsai:branch-25.06 on Apr 30, 2025
112 checks passed
@davidwendt deleted the gtest-memory-peak branch on April 30, 2025 21:03